Collaborating Authors

 mistral ai


The crucial first step for designing a successful enterprise AI system

MIT Technology Review

How to identify the first iconic use case for an enterprise AI transformation. Many organizations rushed into generative AI, only to see pilots fail to deliver value. Now, companies want measurable outcomes--but how do you design for success? At Mistral AI, we partner with global industry leaders to co-design tailored AI solutions that solve their most difficult problems. Whether it's increasing CX productivity with Cisco, building a more intelligent car with Stellantis, or accelerating product innovation with ASML, we start with open frontier models and customize AI systems to deliver impact for each company's unique challenges and goals. Our methodology starts by identifying an iconic use case, the foundation for AI transformation that sets the blueprint for future AI solutions.


Assessing Socio-Cultural Alignment and Technical Safety of Sovereign LLMs

arXiv.org Artificial Intelligence

Recent trends in LLM development clearly show growing interest in the use and application of sovereign LLMs. The global debate over sovereign LLMs highlights the need for governments to develop their own LLMs, tailored to their unique socio-cultural and historical contexts. However, there remains a shortage of frameworks and datasets to answer two critical questions: (1) how well these models align with users' socio-cultural backgrounds, and (2) whether they maintain safety and technical robustness without exposing users to potential harms and risks. To address this gap, we construct a new dataset and introduce an analytic framework for extracting and evaluating the socio-cultural elements of sovereign LLMs, alongside assessments of their technical robustness. Our experimental results demonstrate that while sovereign LLMs play a meaningful role in supporting low-resource languages, they do not always meet the popular claim that these models serve their target users well. We also show that pursuing this untested claim may lead to underestimating critical quality attributes such as safety. Our study suggests that advancing sovereign LLMs requires a more extensive evaluation that incorporates a broader range of well-grounded and practical criteria.


Chatbots work best when you speak to them with formal language

New Scientist

Are you terse and informal when speaking to an AI chatbot? Talking to an AI chatbot in less formal language, as many people do, reduces the accuracy of its responses - suggesting that either we need to be linguistically stricter when using a chatbot, or that the AIs need to be trained to better adapt to informality. Fulei Zhang and Zhou Yu at Amazon looked at how people begin conversations with human agents compared with a chatbot assistant powered by a large language model (LLM). They used the Claude 3.5 Sonnet model to score the conversations on a range of factors and found that people interacting with chatbots used less accurate grammar and were less polite than they were when addressing humans. They also used a slightly narrower range of vocabulary.


AI chatbots fail to diagnose patients by talking with them

New Scientist

Advanced artificial intelligence models score well on professional medical exams but still flunk one of the most crucial physician tasks: talking with patients to gather relevant medical information and deliver an accurate diagnosis. "While large language models show impressive results on multiple-choice tests, their accuracy drops significantly in dynamic conversations," says Pranav Rajpurkar at Harvard University. That became evident when researchers developed a method for evaluating a clinical AI model's reasoning capabilities based on simulated doctor-patient conversations. The "patients" were based on 2000 medical cases primarily drawn from professional US medical board exams. "Simulating patient interactions enables the evaluation of medical history-taking skills, a critical component of clinical practice that cannot be assessed using case vignettes," says Shreya Johri, also at Harvard University.


Propaganda is all you need

arXiv.org Artificial Intelligence

As ML is still a (relatively) recent field of study, especially outside the realm of abstract mathematics, few studies have been conducted on the political aspect of LLMs, and more particularly on the alignment process and its political dimension. This process can be as simple as prompt engineering, but it can also run very deep and affect completely unrelated questions. For example, politically directed alignment has a very strong impact on an LLM's embedding space, and on the relative position of political notions in such a space. Using special tools to evaluate general political bias and analyze the effects of alignment, we can gather new data to understand its causes and possible consequences on society. Indeed, taking a socio-political approach, we can hypothesize that most big LLMs are aligned on what Marxist philosophy calls the 'dominant ideology'. As AI's role in political decision-making grows, at the citizen's scale but also in government agencies, such biases can have huge effects on societal change, either by creating a new and insidious pathway for societal uniformization or by allowing disguised extremist views to gain traction among the people.


Microsoft Strikes Deal with France's Mistral AI

TIME - Tech

Microsoft announced an artificial intelligence partnership Monday with the French startup Mistral AI that could lessen the software giant's reliance on ChatGPT-maker OpenAI for supplying the next wave of chatbots and other generative AI products. Mistral AI emerged less than a year ago but is already what Microsoft described Monday as an "innovator and trailblazer" at the vanguard of building more efficient and cost-effective AI systems. Microsoft and Mistral didn't disclose the financial terms of the deal, though Microsoft said it involves a small investment in the Paris-based startup. That suggests it is far smaller than Microsoft's investment of billions of dollars into OpenAI, a years-long relationship that has attracted the scrutiny of antitrust regulators in the U.S. and Europe. Mistral on Monday released a public test version of its own chatbot, called Le Chat, that apparently was flooded with so much interest that a company executive said it was temporarily unavailable for part of the day.


Why Europe Must Not Let AI Firms Put Profits Before People

TIME - Tech

The soap opera-like ousting and swift return of OpenAI CEO Sam Altman produced plenty of fodder for ironic quips online but it also exposed some serious fault lines. One such critique I enjoyed was: "How are we supposed to solve the AI alignment problem if aligning just a few board members presents an insurmountable challenge?" As the company behind ChatGPT, OpenAI may be one of the more recognizable names, but artificial intelligence is more than one company. It's a technology of immense consequence, yet it remains almost entirely unregulated. The E.U. has a chance to meaningfully tackle that challenge--but not if it bends the knee to Big Tech's ongoing onslaught. Inspirational Members of the European Parliament have so far been standing firm in the face of incredible pressure, in an effort to save this landmark legislation.


E.U.'s AI Regulation Could Be Softened After Pushback From Biggest Members

TIME - Tech

A key aspect of the E.U.'s landmark AI Act could be watered down after the French, German, and Italian governments advocated for limited regulation of the powerful models--known as foundation models--that underpin a wide range of artificial intelligence applications. A document seen by TIME that was shared with officials from the European Parliament and the European Commission by the three biggest economies in the bloc over the weekend proposes that AI companies working on foundation models regulate themselves by publishing certain information about their models and signing up to codes of conduct. There would initially be no punishment for companies that didn't follow these rules, though there might be in future if companies repeatedly violate codes of conduct. Foundation models are some of the most powerful, valuable and potentially risky AI systems in existence. Many of the most prominent and hyped AI companies--including OpenAI, Google DeepMind, Anthropic, xAI, Cohere, InflectionAI, and Meta--develop foundation models.